Need and Role of Scala Implementations in Bioinformatics

Authors

  • Abbas Rehman
  • Ali Abbas
  • Muhammad Atif Sarwar
  • Javed Ferzund
Abstract

Next Generation Sequencing has generated large volumes of omics data at a speed that was not possible before. This data is only useful if it can be stored and analyzed at the same speed. Big Data platforms and tools like Apache Hadoop and Spark have solved this problem. However, most of the algorithms used in bioinformatics for pairwise alignment, multiple alignment, and motif finding are not implemented for Hadoop or Spark. Scala is a powerful language supported by Spark. It provides constructs like traits, closures, functions, pattern matching, and extractors that make it suitable for bioinformatics applications. This article explores the bioinformatics areas where Scala can be used efficiently for data analysis. It also highlights the need for Scala implementations of algorithms used in bioinformatics.

Keywords: Scala; Big Data; Hadoop; Spark; Next Generation Sequencing; Genomics; RNA; DNA; Bioinformatics

Similar resources

Scala Roles - A Lightweight Approach Towards Reusable Collaborations

Purely class-based implementations of object-oriented software are often inappropriate for reuse. In contrast, the notion of objects playing roles in a collaboration has been proven to be a valuable reuse abstraction. However, existing solutions to enable role-based programming tend to require vast extensions of the underlying programming language, and thus, are difficult to use in every day wo...

An Open Framework for Extensible Multi-stage Bioinformatics Software

In research labs, there is often a need to customise software at every step in a given bioinformatics workflow, but traditionally it has been difficult to obtain both a high degree of customisability and good performance. Performance-sensitive tools are often highly monolithic, which can make research difficult. We present a novel set of software development principles and a bioinformatics fram...

Deciphering the functional role of hypothetical proteins from Chloroflexus aurantiacs J-10-f1 using bioinformatics approach

Chloroflexus aurantiacus J-10-f1 is an anoxygenic, photosynthetic, facultatively autotrophic, gram-negative bacterium found in hot springs at temperatures of 50-60°C. It can sustain itself in the dark only if oxygen is available, exhibiting a dark orange color; however, it displays a dark green color when grown in sunlight. The genome of the organism contains a total of 3853 proteins out ...

Two Approaches to Portable Macros

For any programming language that supports macros and has multiple implementations (each with different AST definitions), there is a common problem: how to make macros that operate on ASTs portable among different compiler implementations? Implementing portable macros is especially important for statically typed languages like Scala, as IDE vendors usually have different implementations of the lan...

Bioinformatics to Biostochastics: Statistical Perspectives and Tasks Ahead

Bioinformatics is an emerging field of science emphasizing the application of mathematics, statistics, and informatics to the study and analysis of very large molecular biological (mostly genetic and genomic) systems (data sets). In a comparatively broader setup of large biological systems without necessarily having a predominant genetic undercurrent, and having genesis in biometry to biostatistic...

Publication date: 2017